Abstract

We analyze the cumulative distribution of total personal income of USA counties, 
and gross domestic product of Brazilian, German and United Kingdom counties, 
and also of world countries. We verify that generalized exponential distributions, 
related to nonextensive statistical mechanics, describe almost the whole spectrum 
of the distributions (within acceptable errors), ranging from the low region to the 
middle region, and, in some cases, up to the power-law tail. The analysis over about 
30 years (for USA and Brazil) shows a regular pattern of the parameters appearing 
in the present phenomenological approach, suggesting a possible connection between 
the underlying dynamics of (at least some aspects of) the economy of a country (or 
of the whole world) and nonextensive statistical mechanics. We also introduce two
additional examples related to geographical distributions: land areas of counties and
land prices. The same kind of equations fits the data over the whole range of
the spectrum.
1 Introduction

Despite the ubiquity of Gaussians in nature, there are many, equally
ubiquitous, examples of non-Gaussian distributions. Power laws, for instance,
appear in a variety of physical, biological, psychological, and
social/economical phenomena [1,2,3,4,5,6,7,8,9]. Frequently, such systems do
not exhibit power-law behavior over the entire spectrum, but only power-law
tails. In the characterization of economical systems, more specifically the
distribution of personal income, the data are usually assumed to follow
Pareto's law [1], $p(x) \propto x^{-1-\alpha}$, in the large income region
(typically $1 < \alpha < 2$), and a log-normal distribution [10],

$$p(x) = \frac{1}{x\sqrt{2\pi\sigma^2}} \exp\left[-\frac{\log^2(x/x_0)}{2\sigma^2}\right] \qquad (1)$$

in the middle (or low-middle) income region ($p(x)$ is the probability density
function, $x_0$ is a mean value and $\sigma^2$ is a variance). See [11,12,13]
for recent revisits of this approach to the problem.

Connections between economical systems (financial markets) and nonextensive
statistical mechanics are already known [14] (see [15] for a recent review).
In the present work we address a different feature of economical systems: the
distribution of total personal income (PI) of counties, as well as total gross
domestic product (GDP) of counties for a given country (both PI and GDP can be
an index for the value added). We similarly consider the distribution of GDP of
the countries of the world. We use distributions that belong to the family of
the $q$-exponential function [16,17],

$$\exp_q(x) = [1 + (1-q)\,x]_+^{1/(1-q)} \qquad (2)$$

($q \in \mathbb{R}$, $[\cdots]_+ = \max\{\cdots, 0\}$), that naturally emerge
from nonextensive statistical mechanics [18,19,20] (for recent reviews and
updated bibliography, see Refs. [21,22,23]). $q$-Exponentials with negative
argument, which is the case we are interested in here (hereafter we consider
$\exp_q(-x)$, with $q > 1$ and $x > 0$), present asymptotic power-law tails,
as many complex systems do. Along these lines, we are able to describe (almost)
the whole spectrum of the distribution (and not only the tails) with a single
function, which points towards a unified approach to the problem. In a certain
sense, this problem resembles another one, namely the number of citations of
scientific papers, which likewise presents power-law behavior only at the tail.
It was first conjectured that different phenomena rule highly cited and rarely
cited papers (see Ref. [24] and references therein). A nonextensive approach to
the problem [25] showed that it is possible to have a single function
describing the whole spectrum of citations.
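The asymptotic power-law tail of $\exp_q(-x)$ for $q > 1$ can be checked numerically. The following is a minimal sketch, not code from the original work: the function names `exp_q` and `local_loglog_slope` and the value $q = 1.5$ are illustrative assumptions. It evaluates Eq. (2) and estimates the local slope in a log-log plot, which should approach $1/(1-q)$ for large $x$.

```python
import math

def exp_q(x, q):
    """q-exponential of Eq. (2): [1 + (1 - q) x]_+ ** (1 / (1 - q))."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)  # the q -> 1 limit is the ordinary exponential
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        # [...]_+ replaces a non-positive bracket by 0: the result is 0 for
        # q < 1 (positive exponent) and diverges for q > 1 (negative exponent)
        return 0.0 if q < 1.0 else math.inf
    return base ** (1.0 / (1.0 - q))

def local_loglog_slope(q, x1, x2):
    """Slope of ln exp_q(-x) versus ln x between x1 and x2 (both > 0)."""
    y1, y2 = exp_q(-x1, q), exp_q(-x2, q)
    return (math.log(y2) - math.log(y1)) / (math.log(x2) - math.log(x1))

if __name__ == "__main__":
    q = 1.5  # illustrative value, not a fitted parameter
    # For large x the slope approaches 1 / (1 - q), i.e. -2 for q = 1.5
    print(local_loglog_slope(q, 1e6, 1e7))
```

For small $x$ the same function behaves like an ordinary exponential, which is why a single expression can cover both the low region and the tail.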
2 Distribution generators



Let us formulate the problem by two alternative paths. One way of
characterizing a distribution is through the variational approach, in which an
entropy is maximized under constraints of normalizability and finiteness of a
generalized moment of the distribution ($\langle |x|^\gamma \rangle$ =
constant, $\gamma > 0$) (see, e.g., Ref. [26]). For instance, if we take the
Boltzmann-Gibbs entropy,

$$S = -k_B \int p(x) \ln p(x)\, dx, \qquad (3)$$

subject to the constraints of normalizability

$$\int p(x)\, dx = 1 \qquad (4)$$

and finiteness of a certain moment of order $\gamma$,

$$\int |x|^\gamma\, p(x)\, dx < \infty, \qquad (5)$$

we obtain exponential forms (stretched exponentials),

$$p(x) \propto \exp(-\beta x^\gamma), \qquad (6)$$

with $\beta$ being the Lagrange multiplier. Typically $\gamma = 1$ when we are
dealing with distributions of, e.g., energy, and $\gamma = 2$ for distributions
of space positions, i.e., diffusion. For the sake of generality, we allow not
only integer values of $\gamma$, but rather $\gamma \in \mathbb{R}$; that is
why the modulus appears in Eq. (5) (although in the forthcoming examples we
only consider $\gamma = 1$ or $\gamma = 2$, and also $x > 0$, which makes the
modulus unnecessary). $\gamma = 1$ yields exponentials, and $\gamma = 2$ yields
Gaussians.

If, instead, we take the nonextensive entropy [18]

$$S_q = k\, \frac{1 - \int [p(x)]^q\, dx}{q - 1} \qquad (q \in \mathbb{R}), \qquad (7)$$

($k$ is a non-negative constant, related and possibly equal to Boltzmann's
constant $k_B$), with the same normalizability constraint, Eq. (4), and a
$q$-generalized version of the finiteness of the moment of order $\gamma$,

$$\int |x|^\gamma\, [p(x)]^q\, dx < \infty, \qquad (8)$$

then $q$-stretched exponentials appear:

$$p(x) \propto \exp_q(-\beta_q x^\gamma) = [1 - (1-q)\,\beta_q x^\gamma]_+^{1/(1-q)}. \qquad (9)$$

The limit $q \to 1$ recovers the usual stretched exponentials, Eq. (6), with
$\beta_q = \beta$. $q \ne 1$ and $\gamma = 1$ recover the $q$-exponential
itself, Eq. (2). Functions with $\gamma = 2$ and $q \ne 1$ can consistently be
called $q$-Gaussians. Particular cases are, of course, the Gaussian
distribution ($q = 1$) and the Lorentzian distribution ($q = 2$).
This path was followed in Refs. [27,28].
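The particular cases quoted above can be made concrete with a short sketch of the unnormalized form of Eq. (9). The function name `q_stretched_exp` and the numerical values are illustrative assumptions, not part of the original analysis; the code only evaluates the formula for $x \ge 0$.

```python
import math

def q_stretched_exp(x, q, beta_q, gamma):
    """Unnormalized Eq. (9), [1 - (1 - q) beta_q x**gamma]_+ ** (1/(1-q)), x >= 0."""
    arg = beta_q * x ** gamma
    if abs(q - 1.0) < 1e-12:
        return math.exp(-arg)  # q -> 1: ordinary stretched exponential, Eq. (6)
    base = 1.0 - (1.0 - q) * arg
    if base <= 0.0:
        return 0.0  # cutoff [...]_+, only reachable for q < 1
    return base ** (1.0 / (1.0 - q))

if __name__ == "__main__":
    x = 3.0  # illustrative point
    # q = 2, gamma = 2: Lorentzian shape 1 / (1 + beta x^2)
    print(q_stretched_exp(x, 2.0, 1.0, 2.0), 1.0 / (1.0 + x * x))
    # q = 1, gamma = 2: Gaussian shape exp(-beta x^2)
    print(q_stretched_exp(x, 1.0, 1.0, 2.0), math.exp(-x * x))
```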



An alternative way of characterizing a distribution is through the differential
equation it satisfies. This path was developed in Ref. [29], within the
framework of nonextensive statistical mechanics, but it was originally
formulated in Planck's celebrated first papers on the black-body radiation law
[30], the birth of quantum mechanics. See Ref. [31] for a comprehensive review
of this differential equation approach, and also for historical remarks about
Planck's first papers on black-body radiation. Stretched exponential
distributions obey

$$\frac{1}{\gamma x^{\gamma-1}} \frac{dp}{dx} = -\beta p, \qquad (10)$$

while nonextensive $q$-stretched exponentials follow a simple generalization of
the former equation:

$$\frac{1}{\gamma x^{\gamma-1}} \frac{dp}{dx} = -\beta_q p^q, \qquad (11)$$

whose solution is given by Eq. (9).
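That Eq. (9) solves Eq. (11) can be checked numerically with a central finite difference. This is only a sketch under stated assumptions: $q > 1$, $x > 0$, $p(0) = 1$, and illustrative parameter values; the helper names `p_q` and `ode_residual` are hypothetical.

```python
def p_q(x, q, beta_q, gamma):
    """Eq. (9) with p(0) = 1, written for q > 1 and x >= 0."""
    return (1.0 + (q - 1.0) * beta_q * x ** gamma) ** (-1.0 / (q - 1.0))

def ode_residual(x, q, beta_q, gamma, h=1e-5):
    """Left-hand side minus right-hand side of Eq. (11) at the point x > 0."""
    # Central difference approximation of dp/dx
    dpdx = (p_q(x + h, q, beta_q, gamma) - p_q(x - h, q, beta_q, gamma)) / (2.0 * h)
    lhs = dpdx / (gamma * x ** (gamma - 1.0))
    rhs = -beta_q * p_q(x, q, beta_q, gamma) ** q
    return lhs - rhs

if __name__ == "__main__":
    # Illustrative values only; the residual should be close to zero
    print(ode_residual(1.3, q=1.8, beta_q=0.5, gamma=2.0))
```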

Some complex systems, e.g., re-association of oxygen in folded myoglobin [29],
linguistics [32], cosmic rays [33,34,35], economical systems (distribution of
returns of the New York Stock Exchange [36]), and also the economical examples
we are dealing with here, exhibit not one, but two power-law regimes (i.e., two
different slopes in a log-log plot, according to the value of the independent
variable), with a usually marked crossover between them, sometimes referred
to as the knee. Such behavior demands a probability distribution obeying a
more general differential equation than the two former ones, namely

$$\frac{1}{\gamma x^{\gamma-1}} \frac{dp}{dx} = -\beta_{q'}\, p^{q'} - (\beta_q - \beta_{q'})\, p^q \qquad (12)$$

($1 \le q' \le q$, $0 \le \beta_{q'} \le \beta_q$), where $q$ and $q'$ are
connected to the two different slopes of the regimes (the asymptotic slope in a
log-log plot is given by $\gamma/(1-q)$). $q$ controls the slope of the first
(intermediate) power-law regime, and $q'$ that of the second (the tail). The
solution of Eq. (12) is expressed in terms of hypergeometric functions (see
Ref. [29] for the analytical expression), and we can consistently call such
functions $(q, q')$-stretched exponentials. They naturally present a crossover
(knee) at

$$x_{\text{knee}} = \frac{[(q-1)\,\beta_q]^{\frac{q'-1}{\gamma (q-q')}}}{[(q'-1)\,\beta_{q'}]^{\frac{q-1}{\gamma (q-q')}}}. \qquad (13)$$

Notice that $\beta_{q'} = 0$, or $\beta_{q'} = \beta_q$, or even $q' = q$
recover the differential equation obeyed by $q$-stretched exponentials,
Eq. (11), with $x_{\text{knee}} \to \infty$.
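The knee position can be evaluated directly from the four fitting parameters. The sketch below assumes that the crossover is the intersection of the two asymptotic power laws $[(q-1)\beta_q x^\gamma]^{-1/(q-1)}$ and $[(q'-1)\beta_{q'} x^\gamma]^{-1/(q'-1)}$; the function name `knee_position` is hypothetical. The usage example takes the world GDP parameters quoted in the caption of Fig. 3 ($q = 3.5$, $q' = 1.7$, $1/\beta_q = 111.1$, $1/\beta_{q'} = 2500.0$, $\gamma = 1$).

```python
def knee_position(q, q_prime, beta_q, beta_q_prime, gamma):
    """Crossover between the two power-law regimes of a (q, q')-stretched exponential.

    Assumes 1 < q' < q and beta_q' > 0, so that both regimes exist.
    """
    if not (1.0 < q_prime < q):
        raise ValueError("requires 1 < q' < q")
    if beta_q_prime <= 0.0 or beta_q <= 0.0:
        raise ValueError("requires positive beta_q and beta_q'")
    scale = gamma * (q - q_prime)
    numerator = ((q - 1.0) * beta_q) ** ((q_prime - 1.0) / scale)
    denominator = ((q_prime - 1.0) * beta_q_prime) ** ((q - 1.0) / scale)
    return numerator / denominator

if __name__ == "__main__":
    # Parameters from the caption of Fig. 3 (world GDP, gamma = 1)
    print(knee_position(3.5, 1.7, 1.0 / 111.1, 1.0 / 2500.0, 1.0))
```

With these inputs the function returns a value close to 19 665, the knee position reported in that caption.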

Another particular case of Eq. (12) is obtained with $q' = 1$. It is curious to
note that this differential equation (with $q' = 1$) is known as Bernoulli's
equation [37,38], and it is exactly the one used by Planck in his October 1900
paper [30] (with $q = 2$, $q' = 1$, $\gamma = 1$; see footnote 1). The
distribution of this $q' = 1$ case is given by [29]

$$p(x) = \left[1 - \frac{\beta_q}{\beta_1} + \frac{\beta_q}{\beta_1}\, e^{(q-1)\,\beta_1 x^\gamma}\right]^{\frac{1}{1-q}} \qquad (14)$$

and presents an intermediate power-law regime, followed by an exponential
tail, with a crossover at

$$x_{\text{knee}} = \frac{1}{[(q-1)\,\beta_1]^{1/\gamma}}. \qquad (15)$$

An example of a system that may be described by this ($q \ne 1$, $q' = 1$)
case is the measure of success of musicians [39].
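As a consistency check, Eq. (12) with $q' = 1$ can be integrated numerically and compared with the closed form of Eq. (14). The following is a sketch assuming $p(0) = 1$ and a plain fourth-order Runge-Kutta scheme; the function names and the parameter values ($q = 2$, $\beta_q = 1$, $\beta_1 = 0.2$, $\gamma = 1$) are illustrative assumptions, not fitted values.

```python
import math

def p_closed(x, q, beta_q, beta_1, gamma):
    """Closed-form solution of the q' = 1 case, Eq. (14), with p(0) = 1."""
    r = beta_q / beta_1
    bracket = 1.0 - r + r * math.exp((q - 1.0) * beta_1 * x ** gamma)
    return bracket ** (1.0 / (1.0 - q))

def rhs(x, p, q, q_prime, beta_q, beta_q_prime, gamma):
    """dp/dx as given by Eq. (12)."""
    drive = beta_q_prime * p ** q_prime + (beta_q - beta_q_prime) * p ** q
    return -gamma * x ** (gamma - 1.0) * drive

def integrate_eq12(x_end, q, q_prime, beta_q, beta_q_prime, gamma, n_steps=2000):
    """Integrate Eq. (12) from x = 0, p = 1 to x_end with classical RK4."""
    h = x_end / n_steps
    x, p = 0.0, 1.0
    args = (q, q_prime, beta_q, beta_q_prime, gamma)
    for _ in range(n_steps):
        k1 = rhs(x, p, *args)
        k2 = rhs(x + 0.5 * h, p + 0.5 * h * k1, *args)
        k3 = rhs(x + 0.5 * h, p + 0.5 * h * k2, *args)
        k4 = rhs(x + h, p + h * k3, *args)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return p

if __name__ == "__main__":
    numeric = integrate_eq12(5.0, 2.0, 1.0, 1.0, 0.2, 1.0)
    exact = p_closed(5.0, 2.0, 1.0, 0.2, 1.0)
    print(numeric, exact)  # the two values should agree closely
```

The same integrator can be used for $q' > 1$, where no elementary closed form is available.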



1 It is also worth mentioning that Planck adopted this equation as a fitting
procedure (and certainly with a great amount of physical intuition). In his
words [30], "one gets a radiation formula with two constants . . . which, as
far as I can see at the moment, fits the observational data, published up to
now, as satisfactorily as the best equations put forward for the spectrum ..."
3 Geographical distribution of Total Personal Income and Gross
Domestic Product 



We consider Eq. (12) with $p = P$ being the inverse cumulative probability
distribution, $P(X \ge x) = \int_x^\infty dy\, p(y)$ ($P(X \ge x)$ is the
probability of finding the distribution variable with a value $X$ equal to, or
greater than, $x$), and $x = \tilde{x}/\tilde{x}_0$ is the ratio between an
economical variable and its minimum value: in the discrete case,
$x_i = \tilde{x}_i/\tilde{x}_{\min}$, where $\tilde{x}$ stands for the
economical variable, in our analysis, PI of a county, or GDP of a county (or of
a country). Index $i$ refers to the county (or country), and min is the poorest
(lowest ranking) county (country).
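The construction of the empirical inverse cumulative distribution can be sketched as follows. The function name `inverse_cumulative` and the sample numbers are hypothetical and only illustrate the normalization by the minimum value and the counting of entries equal to or greater than each $x$.

```python
from bisect import bisect_left

def inverse_cumulative(values):
    """Return sorted (x_i, P(X >= x_i)) pairs, with x_i = value_i / min(values)."""
    x_min = min(values)
    xs = sorted(v / x_min for v in values)
    n = len(xs)
    # bisect_left gives the number of entries strictly smaller than x,
    # so n - bisect_left(xs, x) counts entries equal to or greater than x
    return [(x, (n - bisect_left(xs, x)) / n) for x in xs]

if __name__ == "__main__":
    # Hypothetical values of an economical variable for four counties
    sample = [2.0, 4.0, 8.0, 4.0]
    for x, prob in inverse_cumulative(sample):
        # The rank of a county is the number of counties times P
        print(x, prob, len(sample) * prob)
```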

We analyze one case of PI county distribution, USA counties (for years ranging
from 1970 to 2000) [40], and three cases of GDP county distribution: Brazilian
counties (from 1970 to 1996) [41], German counties (from 1992 to 1998)
[42], and United Kingdom counties (from 1993 to 1998) [43]. (See [40] for the
method of calculation of USA county PI.) All these cases are well described
with $\gamma = 2$, i.e., by $(q, q')$-Gaussians.

Fig. 1 illustrates the results with inverse cumulative distributions. The
inverse cumulative distribution, or the rank, is equal to the number of
counties $N_{\text{counties}}$ times $P$, with $P$ given by the corresponding
cumulative probability distribution. Three curves are shown in each of
Figs. 1(a)-(d): (i) $q$-Gaussian distributions, which can describe low range
data, (ii) $(q, q')$-Gaussians, which are able to reproduce the low-middle
range, including the knee, and (iii) log-normal distributions, which were
adjusted to fit middle range values. For USA and Brazil, we observe that the
$(q, q')$-Gaussian describes the data in almost the entire range; for Germany
and UK, both the $(q, q')$-Gaussian and the log-normal are able to describe the
data in the low-middle region (the curves are visually indistinguishable in
this region). For USA and Brazil, the log-normal distribution fails in the low
region (see the insets of Figs. 1(a) and 1(b)). Values of the parameters are
given in Table 1.

At first glance, it might seem that log-normal distributions are more
parsimonious (and consequently, preferable) in the description of these
problems than the $(q, q')$-Gaussians, since the former has two fitting
parameters, while the latter has four. But when we look at the problem in
detail, we realize that in many cases the log-normal is able to describe just
the middle range values of the distributions (sometimes the low-middle range).
Deciding where this middle range begins and where it ends works as if there
were two additional hidden parameters in the log-normal distribution. When this
happens, both the log-normal and the $(q, q')$-Gaussians have the same number
of fitting degrees of freedom.
Table 1

Parameters for the distribution functions, for the years shown in Fig. 1.

Country   Year   $N_{\text{counties}}$   $q$    $q'$   $1/\sqrt{\beta_q}$   $1/\sqrt{\beta_{q'}}$   $x_0$   $\sigma$
USA       2000   3110                    3.80   1.7    87.71                2236.07                 110     7
Brazil    1996   4973                    3.50   2.1    40.82                816.50                  22      10
Germany   1998   440                     2.70   1.5    3.16                 6.59                    3.5     1.5
UK        1998   133                     3.12   1.4    18.26                37.80                   20      1.5


The large GDP range displays a different behavior: the distribution presents a
second crossover, bending upwards and giving rise to a different (third)
power-law regime. This effect is very pronounced for Germany, and to a smaller
degree for the UK, while for USA and Brazil it is almost hidden in the binned
distribution (as shown in Fig. 1), but it is visible in unbinned plots (see
footnote 2). In the USA, for instance, only the two largest counties (Los
Angeles and Cook (part of Chicago)) belong to this regime. Similarly, in
Brazil, we have Sao Paulo and Rio de Janeiro within this regime. This feature
is commonly exhibited by various systems, and is sometimes referred to as the
king effect [44]. It is also present in highly energetic cosmic rays, where it
is referred to as the ankle [33] (we adopt this nomenclature in the Figures).
Such behavior is possibly related to nonequilibrium phenomena, or (at least in
some cases) poor statistics, and lies outside the present approach. We recall
that the number of counties in USA and Brazil is about one order of magnitude
greater than that of Germany and UK, and this is possibly related to the more
pronounced king effect in these last two countries.

Fig. 2 shows the temporal evolution of the parameter $q$. The USA presents an
approximately uniform increase of $q$ over 30 years. In the case of Brazil, the
increasing tendency from 1970 to 1990 was broken from 1990 to 1996. Germany and
the UK present constant values of $q$ over the years for which data are
available. The increase of $q$ (observed for USA and Brazil) indicates
increasing inequality: the greater the $q$, the longer the tail, and the
greater the probability of finding counties much richer than others. The
parameter $q'$ (for a given country) is taken as constant for all years. The
smaller values of $q$ and $q'$ for Germany and UK, when compared to USA and
Brazil, reflect the well balanced distribution of value added of these European
countries, relative to the analyzed American countries. The relation between
the slopes (related to $q$) and equality/inequality is not a new conclusion; it
has been known since Pareto [1] (see also Ref. [45] and references therein).

2 In a binned distribution, the ordinate shows the number of data points
(normalized or not) that fall within a (usually small) region, or bin, in the
abscissa. In the distributions shown in Fig. 1, the bins are logarithmically
equally spaced, i.e., their widths increase exponentially. In unbinned
distributions, each point in the figure corresponds to an original datum. The
binned distribution was chosen in Fig. 1 for better visualization.
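Logarithmically equally spaced bins, as described in footnote 2, can be sketched as follows. This is an illustrative assumption about how such binning may be done, not the procedure actually used for Fig. 1; the function name `log_binned_counts` and the sample data are hypothetical.

```python
import math

def log_binned_counts(values, n_bins):
    """Count positive values in n_bins bins of equal width in ln(x).

    Returns (edges, counts); requires max(values) > min(values) > 0.
    """
    log_lo = math.log(min(values))
    width = (math.log(max(values)) - log_lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        k = int((math.log(v) - log_lo) / width)
        # The largest value sits on the upper edge and goes to the last bin
        counts[min(k, n_bins - 1)] += 1
    # Edges grow geometrically, so bin widths increase exponentially
    edges = [math.exp(log_lo + k * width) for k in range(n_bins + 1)]
    return edges, counts

if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 20.0, 50.0, 900.0, 1000.0]  # hypothetical sample
    edges, counts = log_binned_counts(data, 3)
    print(edges)   # approximately 1, 10, 100, 1000
    print(counts)
```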
We have also analyzed the world GDP country distribution, for the year 2000
[46]. In this case, we found that a $(q, q')$-exponential ($\gamma = 1$) fits
the data better than a $(q, q')$-Gaussian ($\gamma = 2$) in the low-middle
region. Although the difference between the two functions (with $\gamma = 1$
and $\gamma = 2$) is perceptible, it is not large. This purely phenomenological
observation deserves further investigation to confirm or refute our results. If
$\gamma = 1$ is confirmed, a possible interpretation might lie in the nature of
the interactions between countries, which is expected to differ from the
interactions between counties inside a country. Fig. 3 shows the results. The
king effect is also present here, particularly for the two largest GDP
countries, USA and Japan.



4 Distribution of land areas and land markets 

In this section we add two different examples, related to geographical distribu- 
tions: (i) distribution of land areas of Brazilian counties, and (ii) distribution 
of Japan land prices. 

Let us focus on the first example, illustrated by Fig. 4. The smallest
Brazilian county has 2.9 km$^2$ (Santa Cruz de Minas, in the State of Minas
Gerais), and the largest one has 161446 km$^2$ (Altamira, in the State of Para,
within the Amazon forest) [47]. There are many causes for a county to have a
given area, including, among others, geographical, political, demographical and
economical factors. The $(q, q')$-Gaussian fits (within an acceptable error)
practically all county areas (more than 5500, in the year 1998), from the
smallest up to the largest.

Now let us consider the problem of Japan land prices, recently addressed in
[48]. The author found a power-law tail for the cumulative probability
distribution of land price, with a slope of $-1.7$
($P(X \ge x) \propto x^{-1.7}$). Fig. 5 makes evident that the $q$-Gaussian
(with $\beta_{q'} = 0$) fits the whole range of data (except the point with the
highest price) and not only the tail. We recall that the probability
distribution from [48] is binned; with unbinned data we could perhaps find
$\beta_{q'} \ne 0$, which would possibly bring the last point(s) onto the
curve.
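The link between the tail slope and the index $q$ follows from the asymptotic slope $\gamma/(1-q)$ stated in Section 2. A minimal sketch of the conversion in both directions is given below; the function names `tail_slope` and `q_from_slope` are hypothetical, and the numbers are those quoted in the caption of Fig. 5 ($q = 2.136$, slope $-1.76$, $\gamma = 2$).

```python
def tail_slope(q, gamma):
    """Asymptotic log-log slope gamma / (1 - q) of a q-stretched exponential, q > 1."""
    return gamma / (1.0 - q)

def q_from_slope(slope, gamma):
    """Invert slope = gamma / (1 - q) to obtain q from a measured (negative) slope."""
    return 1.0 - gamma / slope

if __name__ == "__main__":
    # Values quoted in the caption of Fig. 5 for a q-Gaussian (gamma = 2)
    print(tail_slope(2.136, 2.0))    # close to -1.76
    print(q_from_slope(-1.76, 2.0))  # close to 2.136
```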



5 Final remarks 

Finally we would like to point out that $q$-Gaussians and log-normal
distributions seem to be equally able to describe the data in the low-middle
region. As these data usually span not so many decades, one cannot
unequivocally decide the 'correct' distribution by simple comparison, and such
phenomenological approaches (ours included) only give hints about the
underlying dynamics of the economy. $q$-Gaussians present power-law tails
(log-normal distributions do not), which is an important characteristic when
dealing with complex systems. The good agreement of the present procedure with
data, including the smoothness of the temporal evolution of the parameters,
together with previous works along these lines [14,15], may suggest a new path
for investigating economical relations, namely the development of models based
on the framework of nonextensive statistical mechanics. Such $q$-distributions
(or $(q, q')$-distributions) appear when long-range interactions, long-term
memory, (multi)fractality and/or small-world networking are present, which are
expected features of complex systems (including economical ones). This further
central step (development of models that essentially describe the underlying
microscopic dynamics of the systems) is certainly not an easy task. The answer
(as suggested in Ref. [31]) may come from the Barabasi-Albert approach to the
problem of small-world networks [49,50], which considers preferential
attachment of new vertices, added to a network, to sites that are already well
connected, leading to scale-free power-law distributions.

These examples join many others (some cited in the Introduction) that are
fairly well fitted by equations related to nonextensive statistical mechanics.
Of course these are only empirical observations, but there is a significant
amount of evidence that nonextensivity and complex systems are intimately
connected (we do not mean that all complex systems are somehow related to
nonextensivity, but it is possible that complex systems may be divided into
classes of universality, some of them presenting a nonextensive nature).
Besides, it is already known how to estimate a priori the $q$ index for some
classes of systems, namely low-dimensional dissipative maps, based on their
dynamics (see [51,52] and references therein). Of course these are much simpler
systems than the ones we are dealing with, but many efforts have been made
to extend this predictive nature to other more complex systems (see also [31]).
These results point towards the interpretation that the nonextensive indexes
$q$ and $q'$ are not merely fitting parameters, but are related to more
fundamental dynamic features of the problem. With this point of view in mind,
the approach proposed here, at the moment just a phenomenological observation,
may turn out to be of a more fundamental nature, and not simply a fitting
procedure.
Fig. 1. Binned inverse cumulative distribution of county PI/PI$_0$ (USA) and
GDP/GDP$_0$ (Brazil, Germany and UK). Three distributions are displayed for
comparison: (i) $q$-Gaussian (with $\beta_{q'} = 0$) (dot-dashed), (ii)
$(q, q')$-Gaussian (solid), and (iii) log-normal (dashed lines). Figures (a)
and (b) present insets with linear-linear scale, to make more evident the
quality of the fit in the low region (in Figs. (c) and (d), the
$(q, q')$-Gaussian and the log-normal curves are superposed and so are
visually indistinguishable). The positions of the knees (according to Eq. (13))
are indicated. The ankle is particularly pronounced in (c), though it is also
present in the other cases.
Fig. 2. Evolution of parameter $q$ for USA (squares), Brazil (circles), UK (up
triangles) and Germany (down triangles). The parameters $q'$ (for each country)
are constant for all years: $q'_{\text{Brazil}} = 2.1$,
$q'_{\text{USA}} = 1.7$, $q'_{\text{Germany}} = 1.5$, $q'_{\text{UK}} = 1.4$.
The lines are only guides to the eye.
Fig. 3. Inverse cumulative distribution of GDP/GDP$_0$ of 167 countries for the
year 2000 (unbinned data: each point corresponds to a country). The data are
fitted with $(q, q')$-exponential (solid) and log-normal (dashed line)
distributions; they are visually indistinguishable for this example. A
$q$-exponential (with $\beta_{q'} = 0$, dot-dashed) is also shown for
comparison. Values of the parameters are $q = 3.5$, $q' = 1.7$,
$1/\beta_q = 111.1$, $1/\beta_{q'} = 2500.0$. The knee, according to Eq. (13),
is located at GDP/GDP$_0$ = 19 665. Log-normal curve with $x_0 = 220$ and
$\sigma = 13$.
Fig. 4. Inverse cumulative distribution of land areas of Brazilian counties
(unbinned data). Solid line is a $(q, q')$-Gaussian. $q = 3.07$, $q' = 1.56$,
$1/\sqrt{\beta_q} = 353.55$ km$^2$, $1/\sqrt{\beta_{q'}} = 11226.7$ km$^2$.
Fig. 5. Inverse cumulative probability distribution of Japan land prices for
the year 1998. The data (binned) were taken from Fig. 1 of [48]. Solid curve is
a $q$-Gaussian with $q = 2.136$, which corresponds to the slope $-1.76$ (found
by the author of [48]), and $1/\sqrt{\beta_q} = 188\,982$ Yen.